The wild Oryza species are rich in genetic diversity and are good resources for modern breeding of rice varieties. Reliable ex situ conservation of these genetic resources supports both basic and applied rice research. For this purpose, we developed PCR-based, co-dominant insertion/deletion (INDEL) markers that enable discrimination of the genome types or species in the genus Oryza. First, 12,107 INDEL candidate sequences were found in the BAC end sequences for 12 Oryza species available in public databases. Next, we designed PCR primers for INDEL-flanking sequences to match the characteristics of each INDEL, based on an assessment of their likelihood to give rise to a single or few PCR products in all 102 wild accessions, covering most Oryza genome types. Then, we selected 22 INDEL markers to discriminate all genome types in the genus Oryza. A phylogenetic tree of 102 wild accessions and two cultivars, built from amplicon polymorphisms for the 22 INDEL markers, corresponded well to those in previous studies, indicating that the INDEL markers developed in this study are a useful tool for improving the reliability of identification of wild Oryza species in germplasm stocks.
Introduction

Rice (Oryza sativa L.) is one of the most important crops in many countries, and many breeders have made extraordinary efforts to improve yield, grain quality and other agronomic traits in cultivated rice. The completion of the rice genome sequence has promoted molecular breeding and helped shorten the time needed to develop new varieties. On the other hand, large-scale and long-term cultivation of a few varieties has resulted in a genetic breakdown of modern varieties, for example, in terms of resistance to biotic stresses (Kottapalli et al. 2010). Wild species are known to be a source of useful genes for potential use in modern rice breeding.

The genus Oryza is composed of 23 species, two cultivated and 21 wild (reviewed by Vaughan and Morishima 2003). With regard to chromosome structure, Oryza species are classified into nine genomic types, A to J (I is absent), according to the pairing affinity of meiotic chromosomes in hybrid plants. AA wild species are closely related to cultivated rice, O. sativa, and have been useful resources for current breeding programs (Jena 2010) because of their high cross compatibility with cultivars. Non-AA species are distantly related to cultivars and seldom used for breeding, due to the lower affinity of the meiotic chromosomes and the consequent seed sterility of hybrids with cultivated rice. However, non-AA species contain many agronomically important genes (Jena 2010, Nonomura et al. 2010), so the availability of genetically reliable populations of Oryza species would be beneficial in rice breeding.

The ex situ conservation of wild species is always faced with problems arising from outcrossing of species or ecotypes, mixed collections of different species in natural habitats and artificial contamination during conservation. The great efforts of many taxonomists have solved these problems and enabled us to classify and conserve the Oryza species reliably. However, it is still difficult for most researchers to classify the Oryza species based on morphological and physiological traits.

In this study, we aimed to develop new molecular markers that allow even inexperienced researchers to discriminate species or genome types of wild Oryza accessions easily by polymerase chain reaction (PCR) and gel electrophoresis. Single nucleotide polymorphisms (SNPs) and insertion/deletion (INDEL) markers are the most commonly used markers in plants, because they are easy to use, PCR-based, co-dominant and relatively abundant (Pacurar 2012). Simple sequence repeat (SSR) markers are also useful, but are being rapidly replaced with SNP markers, because SNPs are more stable and more amenable to automation (reviewed by McCouch et al. 2010). Numerous SNP markers have been developed and used for genomic selection, genomic association and quantitative-trait-loci mapping in breeding populations of modern varieties with local varieties or closely related wild species. In this study, we developed INDEL markers, because their polymorphisms are easily detected by PCR and direct gel electrophoresis, whereas SNP information may have to be converted into CAPS (cleaved amplified polymorphic sequences) or dCAPS (derived CAPS) markers, for which restriction endonuclease cleavage is necessary prior to gel electrophoresis (Michaels and Amasino 1998, Neff et al. 1998). Here, we show that 22 INDEL markers are available to discriminate species or genome types reliably in the genus Oryza.

Materials and Methods 

Plant materials 

All the wild accessions of the genus Oryza were provided by the National Institute of Genetics (NIG) and the National Bioresource Project (NBRP), Japan (Nonomura et al. 2010) and are listed in Table 1. Of 282 wild accessions of the NIG core collection, 42 of rank 1 and 60 of rank 2 were used. The accessions were selected to cover 20 of the 21 wild Oryza species (Fig. 1), and four or more accessions were analyzed for most species. In addition, O. sativa cv. Nipponbare and cv. Kasalath were used as standard varieties representing ssp. japonica and indica, respectively. All plants were grown in the summers of 2005 and 2006 in a field of the NIG, Mishima, Japan. Genomic DNA was extracted from mature leaves by the CTAB method (Rogers and Bendich 1988).

Identification of INDEL markers 

The bacterial artificial chromosome (BAC)-end sequences (BESs) released by the Oryza Map Alignment Project (OMAP) (Ammiraju et al. 2006, Wing et al. 2005) and available in public databases were used to design PCR primers. The OMAP BESs derived from 12 wild Oryza species (Table 2) were mapped in silico to the reference O. sativa cv. Nipponbare genome (IRGSP Build 5) (http://rgp.dna.affrc.go.jp/E/IRGSP/Build5/build5.html) by a BLAST search (Altschul et al. 1997). In this step, we selected the BESs that displayed high similarity to the reference sequences except for INDEL regions. For further selection, the gap size was set in the range from 51 to 2,000 base pairs (bp). The primer pair to amplify each of the BESs selected above was designed with Primer3 software (http://primer3.sourceforge.net). The sequences of the reference genome and all primer pairs were put into e-PCR software (http://www.ncbi.nlm.nih.gov/projects/e-pcr/) and the primer pairs virtually amplifying a single band were selected. Each of the pairs was further screened for giving rise to a single band against its original BES in e-PCR software. To further limit the number of candidates, we chose primer pairs that amplified PCR products 50 to 1,500 bp long. In addition, the INDEL size was set at 10% or more of each PCR product for easy detection of the polymorphism using agarose gel electrophoresis.
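The numerical thresholds in this section can be collected into a single filter. The following is a minimal sketch, not the authors' pipeline: the `Candidate` record, its field names and the toy values are hypothetical, and the e-PCR result is assumed to be already summarized as a count of predicted bands.

```python
from dataclasses import dataclass

# Thresholds taken from the text above.
MIN_GAP_BP, MAX_GAP_BP = 51, 2000          # allowed gap (INDEL) size
MIN_PRODUCT_BP, MAX_PRODUCT_BP = 50, 1500  # allowed PCR product length
MIN_INDEL_FRACTION = 0.10                  # INDEL must be >= 10% of the product


@dataclass
class Candidate:
    """Hypothetical record for one primer pair; field names are illustrative."""
    name: str
    indel_bp: int    # gap size between the BES and the reference genome
    product_bp: int  # predicted PCR product length
    epcr_hits: int   # number of bands predicted in silico


def passes_filters(c: Candidate) -> bool:
    """Return True if the candidate satisfies every threshold named in the text."""
    return (
        MIN_GAP_BP <= c.indel_bp <= MAX_GAP_BP
        and MIN_PRODUCT_BP <= c.product_bp <= MAX_PRODUCT_BP
        and c.epcr_hits == 1
        and c.indel_bp >= MIN_INDEL_FRACTION * c.product_bp
    )


# Toy candidates (invented values, for illustration only).
candidates = [
    Candidate("toy_A", indel_bp=56, product_bp=352, epcr_hits=1),   # passes
    Candidate("toy_B", indel_bp=60, product_bp=900, epcr_hits=1),   # INDEL < 10% of product
    Candidate("toy_C", indel_bp=120, product_bp=400, epcr_hits=2),  # two predicted bands
    Candidate("toy_D", indel_bp=30, product_bp=200, epcr_hits=1),   # gap below 51 bp
]
selected = [c.name for c in candidates if passes_filters(c)]
print(selected)
```

Keeping the thresholds as named constants makes it easy to see how relaxing any one of them (for example the 10% rule, which exists only for agarose-gel resolution) would change the candidate pool.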

Amplicon polymorphism assay 

PCR was performed under conditions of 94°C for 2 min and a subsequent 35 cycles of 94°C for 1 min, 56°C for 1 min and 72°C for 1 min, followed by 72°C for 2 min, using the GoTaq Green Master Mix kit (Promega) and a T1 Thermocycler (Biometra). Amplicon polymorphisms of the INDEL markers were identified by electrophoresis in 3% gels of Certified Low Range Ultra Agarose (Bio-Rad) followed by ethidium bromide staining.
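For planning purposes, the cycling program above can be written down as data and its programmed hold time summed. This is only an illustrative sketch; the variable names are assumptions, and ramping time between temperatures is ignored.

```python
# Cycling profile as stated in the text: (temperature in deg C, minutes).
INITIAL_DENATURATION = (94, 2)
CYCLE_STEPS = [(94, 1), (56, 1), (72, 1)]  # denaturation, annealing, extension
N_CYCLES = 35
FINAL_EXTENSION = (72, 2)


def total_hold_minutes() -> int:
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(minutes for _, minutes in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]


print(total_hold_minutes())  # 2 + 35 * 3 + 2 = 109
```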

To detect minute differences in electrophoretic distances more precisely, we also performed fragment analysis of the PCR products using the fluorescent-labeling fragment analyzing system of a capillary sequencer (Applied Biosystems (ABI) PRISM 3130xl) in accordance with the instructions for microsatellite analysis by ABI. ABI PRISM fluorescent primers, fluorescently 5'-labeled with PET, 6-FAM, NED, or VIC dyes, were used for forward INDEL primers. To avoid the "additional A" problem, in which unexpected addition of an adenine residue at the tail of PCR products often makes results unstable, seven nucleotides were artificially added to the 5' end of reverse primers as recommended by ABI. The PCR product lengths were analyzed using a 3130xl Genetic Analyzer and GeneMapper software (ABI).

DNA sequencing 

PCR products amplified with unlabeled INDEL primer pairs were directly sequenced using a BigDye Terminator v3.1 Cycle Sequencing Kit (ABI) according to the manufacturer's instructions. Some products were cloned into vector pCRII (Invitrogen), then amplified and sequenced with universal M13 forward and reverse primers.

Phylogenetic analysis 

A rooted phylogenetic tree was drawn by the UPGMA method (Sokal and Michener 1958) on the basis of the PCR product lengths of INDEL markers determined manually following gel electrophoresis. Data from the fragment analysis were used only to judge the size of DNA fragments showing similar, but slightly different, mobilities in the gel. The tree construction was based on Nei's chord distance (Nei et al. 1983) and performed using Populations version 1.2.31 software (http://bioinformatics.org/~tryphon/populations/) with 1,000 bootstrap replicates.
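The distance-then-cluster procedure described above can be sketched in a few lines. This is a hedged illustration, not a reimplementation of the Populations software: it assumes that the distance meant is the D_A distance of Nei et al. (1983), D_A = 1 - (1/r) Σ_j Σ_i sqrt(x_ij y_ij) over r markers, that each band of a marker counts as an allele with equal frequency within an accession, and it omits bootstrapping. The accession names and band sizes are invented.

```python
from itertools import combinations
from math import sqrt


def da_distance(a, b):
    """D_A distance between two accessions.

    a, b: dict mapping marker name -> tuple of band sizes (bp).
    Assumption: each band of a marker is an allele with equal frequency.
    """
    markers = sorted(set(a) & set(b))
    shared = 0.0
    for m in markers:
        freq_a = {band: 1 / len(a[m]) for band in a[m]}
        freq_b = {band: 1 / len(b[m]) for band in b[m]}
        shared += sum(sqrt(freq_a[x] * freq_b[x]) for x in freq_a.keys() & freq_b.keys())
    return 1 - shared / len(markers)


def upgma(profiles):
    """Return a Newick-like string built by average-linkage (UPGMA) merging."""
    names = sorted(profiles)
    dist = {
        frozenset(pair): da_distance(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(names, 2)
    }
    clusters = [(name, [name]) for name in names]  # (label, member list)

    def average(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((x, y))] for x in c1[1] for y in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (f"({clusters[i][0]},{clusters[j][0]})", clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical band-size profiles for four accessions and two markers.
profiles = {
    "acc_AA_1": {"m1": (352,), "m2": (281,)},
    "acc_AA_2": {"m1": (352,), "m2": (281,)},
    "acc_CC_1": {"m1": (375,), "m2": (300,)},
    "acc_BBCC_1": {"m1": (375,), "m2": (200, 300)},
}
tree = upgma(profiles)
print(tree)
```

In this toy input the tetraploid-like profile shares one of its two bands with the diploid-like profile, so the two cluster together before joining the pair of identical accessions.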

Table 1. The accessions of wild Oryza species used in this study

Acc. No.      Species           Genome type   Origin
[illegible]   rufipogon         AA            India
[illegible]   rufipogon         AA            India
[illegible]   rufipogon         AA            Philippines
[illegible]   rufipogon         AA            Thailand
[illegible]   rufipogon         AA            Thailand
[illegible]   rufipogon         AA            India
[illegible]   rufipogon         AA            Myanmar
W1236         rufipogon         AA            unknown
W1807         rufipogon         AA            Sri Lanka
[illegible]   rufipogon         AA            unknown
[illegible]   rufipogon         AA            unknown
[illegible]   rufipogon         AA            Australia
[illegible]   rufipogon         AA            Cambodia
[illegible]   barthii           AA            Sierra Leone
[illegible]   barthii           AA            Cameroun
[illegible]   barthii           AA            Guinea
[illegible]   barthii           AA            Mali
W0747         barthii           AA            Mali
W1646         barthii           AA            Tanzania
W1169         glumaepatula      AA            Cuba
W2145         glumaepatula      AA            Brazil
[illegible]   glumaepatula      AA            Brazil
[illegible]   glumaepatula      AA            Suriname
W1187         glumaepatula      AA            Brazil
[illegible]   glumaepatula      AA            Colombia
[illegible]   meridionalis      AA            Australia
[illegible]   meridionalis      AA            Australia
[illegible]   meridionalis      AA            Australia
[illegible]   meridionalis      AA            Australia
[illegible]   meridionalis      AA            Australia
[illegible]   meridionalis      AA            Australia
[illegible]   meridionalis      AA            Australia
[illegible]   longistaminata    AA            Sierra Leone
[illegible]   longistaminata    AA            Madagascar
[illegible]   longistaminata    AA            Gambia
[illegible]   longistaminata    AA            Guinea
[illegible]   longistaminata    AA            Congo
[illegible]   longistaminata    AA            Cameroun
[illegible]   punctata (2x)     BB            Kenya
W1590         punctata (2x)     BB            Cameroun
[illegible]   punctata (4x)     BBCC          Ghana
[illegible]   punctata (4x)     BBCC          Nigeria
W1474(B)      punctata (4x)     BBCC          Chad
[illegible]   minuta            BBCC          Philippines
W1331         minuta            BBCC          Philippines
W0016         minuta            BBCC          Surinam
W1319         minuta            BBCC          Philippines
W1323         minuta            BBCC          Philippines
W1328         minuta            BBCC          Philippines
W1329         minuta            BBCC          Philippines
W0002         officinalis       CC            Thailand
[illegible]   officinalis       CC            Malaysia
[illegible]   officinalis       CC            unknown
[illegible]   officinalis       CC            India
[illegible]   officinalis       CC            Philippines
[illegible]   officinalis       CC            Sri Lanka
W1805         rhizomatis        CC            Sri Lanka
[illegible]   eichingeri        CC            Uganda
W1519         eichingeri        CC            Uganda
W1522         eichingeri        CC            Uganda
W1525         eichingeri        CC            Uganda
[illegible]   latifolia         CCDD          Mexico
W1197         latifolia         CCDD          Colombia
[illegible]   latifolia         CCDD          Brazil
W0019         latifolia         CCDD          Cuba
[illegible]   latifolia         CCDD          Panama
[illegible]   alta              CCDD          Surinam
[illegible]   alta/latifolia    CCDD          Guyana
W0018         alta              CCDD          Paraguay
W1147         alta              CCDD          Surinam
W0613         grandiglumis      CCDD          Brazil
W1194         grandiglumis      CCDD          Brazil
[illegible]   grandiglumis      CCDD          Brazil
[illegible]   grandiglumis      CCDD          Brazil
W1480(B)      grandiglumis      CCDD          Brazil
[illegible]   australiensis     EE            Australia
[illegible]   australiensis     EE            Australia
[illegible]   australiensis     EE            Australia
[illegible]   australiensis     EE            Australia
[illegible]   australiensis     EE            Australia
[illegible]   brachyantha       FF            Sierra Leone
[illegible]   brachyantha       FF            Cameroun
[illegible]   brachyantha       FF            Mali
[illegible]   brachyantha       FF            Chad
[illegible]   granulata         GG            India
[illegible]   granulata         GG            Thailand
[illegible]   granulata         GG            Ceylon
[illegible]   granulata         GG            Myanmar
[illegible]   meyeriana         GG            Malaysia
[illegible]   meyeriana         GG            Malaysia
[illegible]   meyeriana         GG            Malaysia
[illegible]   meyeriana         GG            Malaysia
[illegible]   meyeriana         GG            Malaysia
[illegible]   meyeriana         GG            unknown
[illegible]   ridleyi           HHJJ          Thailand
W0604         ridleyi           HHJJ          Malaya
W2033         ridleyi           HHJJ          Thailand
W2035         ridleyi           HHJJ          Thailand
W1220         longiglumis       HHJJ          Indonesia
W1215         longiglumis       HHJJ          Indonesia
W1224         longiglumis       HHJJ          Indonesia
W1228         longiglumis       HHJJ          Indonesia

[Fig. 1 diagram: rows, the 22 INDEL markers; columns, the 20 wild Oryza species grouped by genome type (AA, BB, BBCC, CC, CCDD, EE, FF, GG, HHJJ); cells, estimated band sizes in bp.]

Fig. 1. A diagram of the polymorphic PCR band pattern of 20 wild Oryza species obtained using 22 INDEL markers. The color pattern indicates how many polymorphic bands were detected with each of the 22 INDEL markers for the 20 wild Oryza species. Species shown in the same color exhibited an identical band size for that marker on gel electrophoresis. The number in each colored cell indicates the estimated band size (bp). For example, the Ch07-233W marker gave rise to 5 different bands: 281 bp (magenta), 440 bp (yellow), 200 bp (green), 300 bp (light blue) and 181/350 bp (dark blue). In BBCC and CCDD tetraploid species, this marker gave two bands from both BB and CC types, colored green and light blue. Two numbers with a slash (e.g., 181/350) indicate that in tetraploid species (for example, HHJJ), the genome types from which the two bands arose are uncertain. Two numbers with a comma (e.g., 210, 370) indicate that the marker amplifies two PCR bands in a diploid species. Numbers with asterisks indicate that those bands are polymorphic among accessions within the same species. ruf, rufipogon; bar, barthii; glu, glumaepatula; mer, meridionalis; lon, longistaminata; pun, punctata; min, minuta; off, officinalis; eic, eichingeri; rhi, rhizomatis; lat, latifolia; alt, alta; gra, grandiglumis; aus, australiensis; bra, brachyantha; gta, granulata; mey, meyeriana; rid, ridleyi; log, longiglumis. The seven marker IDs colored red are a minimum marker set useful for classifying species and genome types in the genus Oryza.

Flow cytometry

In addition to wild species, we used a haploid plant, produced from O. sativa cv. Nipponbare by a slightly modified version of the anther culture method described by Niizeki and Oono (1968), to normalize the nuclear DNA contents of wild species. Sample preparation and flow cytometric analysis were performed in accordance with Miyabayashi et al. (2007) with slight modifications. Sample nuclei were extracted from a piece of adult leaf (5 × 5 mm). The leaves were chopped with a razor blade in 400 µL of extraction solution A (Partec) and incubated for 30 min. The supernatant was filtered through 50 µm and subsequently 20 µm CellTrics filters. Then 160 µL of staining solution B (Partec) was added to the supernatant and incubated for 30 min. The extract from the wild species sample was mixed with an equal volume of that from the haploid plant and subjected to measurement of nuclear DNA content with a ploidy analyzer PA system (Partec) according to the manufacturer's instructions.
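Normalizing against the co-measured haploid reference amounts to taking a ratio of fluorescence peak positions. The sketch below illustrates that arithmetic only; the function name, the peak values and the plant labels are hypothetical, and because genome size differs among Oryza species the ratio is a relative DNA content, not a ploidy level by itself.

```python
def relative_dna_content(sample_peak: float, haploid_peak: float) -> float:
    """Sample peak position divided by the haploid Nipponbare peak position."""
    if haploid_peak <= 0:
        raise ValueError("reference peak must be positive")
    return sample_peak / haploid_peak


# Hypothetical peak positions (arbitrary fluorescence units) for two plants
# of one accession, each measured together with the haploid reference.
plants = {"plant_1": (100.0, 50.0), "plant_2": (205.0, 50.0)}
ratios = {name: relative_dna_content(s, h) for name, (s, h) in plants.items()}
print(ratios)
```

Two plants of the same accession whose ratios differ by roughly a factor of two, as in this invented example, would be candidates for a mixture of ploidy levels.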

Online disclosure of marker information 

Information on the 22 INDEL markers obtained in this study is openly accessible in the integrated rice science database Oryzabase (Yamazaki et al. 2010, http://www.shigen.nig.ac.jp/rice/oryzabaseV4/).

Results 

Selection of INDEL markers to discriminate wild Oryza species

To design PCR primers for discriminating wild relatives of rice, BESs derived from twelve wild species, including O. nivara and O. coarctata in accordance with the taxonomy in Ammiraju et al. (2006), were selected from public databases and compared to the reference O. sativa cv. Nipponbare genome. This permitted the identification of 12,107 putative INDELs in the genomes of wild species compared with cultivated rice (Table 2). Of these, 3,244 INDEL loci were shared by 3 to 10 species each (referred to hereafter as multi-INDELs) (data not shown). The multi-INDELs were further reduced to 120 loci based on the degree of conservation of INDEL-flanking sequences and examination of the lengths of PCR products (see Methods). In addition, we selected 104 INDELs, each of which was detected with reference to a single BES from O. rufipogon, O. punctata, O. brachyantha, or O. granulata (single INDELs). In total, 224 markers were investigated for their applicability in classification of wild rice relatives. A total of 40 rice microsatellite (RM) markers (McCouch et al. 2002) were also investigated. However, all of the RM markers gave intraspecifically polymorphic PCR fragments only in AA-genome species (data not shown); therefore, they were excluded from subsequent analyses.

Table 2. The number of putative INDELs detected in silico in this study

Oryza species*   Genome type   No. INDEL candidate sites   No. BESs used
rufipogon        AA            [illegible]                 [illegible]
nivara           AA            [illegible]                 [illegible]
glaberrima       AA            [illegible]                 [illegible]
punctata         BB            1,270                       68,384
minuta           BBCC          1,911                       169,460
officinalis      CC            1,504                       101,091
alta             CCDD          1,370                       128,732
australiensis    EE            705                         135,769
brachyantha      FF            550                         67,364
granulata        GG            538                         138,171
ridleyi          HHJJ          831                         204,729
coarctata        HHKK          1,971                       195,285
Total                          12,107                      1,452,912

* The taxonomy is in accordance with Ammiraju et al. (2006).

A total of 22 wild accessions representing the 20 Oryza species, in addition to two rice varieties, O. sativa ssp. japonica cv. Nipponbare and ssp. indica cv. Kasalath, were used for the first assessment of INDEL markers. We screened the 224 primer sets, focusing on their ability to amplify PCR products in all genome types, especially in distantly related EE, FF, GG or HHJJ species, since this study aimed to establish PCR-based markers applicable to all Oryza species. In this screening, 60 of 120 multi- and 13 of 104 single INDELs were selected and further analyzed (Supplemental Fig. 1). More efficient amplification in distantly related species by multi-INDELs than by single INDELs suggested that INDEL-flanking sequences are diverse among species, and that careful mining of flanking sequences conserved across multiple species is important for efficient selection of INDEL markers. In the second assessment, 22 out of 73 INDELs that gave clear PCR bands in 102 wild accessions were selected (Table 3 and Supplemental Fig. 1) and further analyzed. The 22 INDEL loci were widely distributed on all Nipponbare chromosomes except for chromosomes 11 and 12, and eighteen of them were multi-INDELs (suffixed "W" in Table 3). The size of the 22 INDELs ranged from 56 to 432 bp and averaged 150.8 bp (Table 3).

Table 3. The 22 INDEL markers developed to discriminate wild Oryza species in this study

Marker ID*   Forward primer              Reverse primer              INDEL size (bp)   BES ID
Ch01-301W    TTTGTTCATCTGCATCAACTCA      TGATCATACGATGGAAAGGTAGA     56                OR_ABa0229J12.r
Ch02-269W    TTCGGTAAAGAACCTCTTGAGTG     GATTCTAGTGCCATTTCGCC        135               OP_Ba0011A10.r
Ch02-308W    CCTTAAGAAATTGTTAGTTCAGGCA   CTTCTTCTTGCTAGCAGTTGTCT     64                OP_Ba0008P06.f
Ch02-342W    TGGCAGACCATCTGAGAGAG        CCAGAAATCAGCAATCTGCAA       196               OB_Ba0075C24.f
Ch02-343W    TTCTCCACCCTCGTCTTCTC        TTGCTCTCCAGCTTCTCCTC        245               OB_Ba0060I13.r
Ch03-128W    TGATTCCTTGGTAGTCTTCCC       TGCTCACCATAGACTCTTCCA       76                OR_ABa0265L24.r
Ch03-173W    AGGCAAAGTTCAGAATGCAA        AGCGGCAATAGCCATCTAAG        217               OG_ABa0011K22.r
Ch03-363W    TTTCCGTCAGATTGCACATT        TGATTTACCACCAAACAGTAAGTCA   98                OB_Ba0073N03.f
Ch04-276G    GGTACCTCCAGGAATCCCAT        CCAATGTGCATGGCATTTAG        229               OG_ABa0018K10.r
Ch04-312W    TTCTTTGTCGTGATCGCAAG        TTTCATTCAACGTGGTGGTT        218               OR_ABa0207K04.f
Ch05-067W    CCCATTCCCTATACCTGTGTAAA     AGAATCACAGAGGATCCGAA        60                OR_ABa0151A19.r
Ch05-070W    GGAAGAAAGCAAGGATGCAA        TCTGCTGTCATATGCTTGGG        108               OA_BBa0041D18.f
Ch05-109G    TGATGATGAAATACCTTGCCC       TGTATGGCTGCATTTGCACT        276               OG_ABa0128B15.f
Ch05-202W    TCTTCAAGAAACCAGAAGATCTGA    TGGATGTGCTTCTGACGCTA        83                OG_ABa0074B07.r
Ch05-277W    CCAGAACCGTTGTTCCTGTT        GGATGTTGAGAAGGGTGGAA        64                OR_ABa0019H23.r
Ch06-269W    CCAATGAAATGCAGTCGAGA        GGACACATTCAACCCTCACA        79                OC_Ba0241E15.f
Ch06-300W    CACAGACAGTGTCCAGAGTTCAG     GCCTTGAAAGTTGAAACCCA        180               OG_ABa0049K22.f
Ch06-306W    GAGCCCTCGGTTAGATGTGA        CGTGCCGTATATGTCCTGAA        61                OM_Ba0207M10.f
Ch07-233W    TCACCAAGCCAATTCTTCTTC       CCTCCTAAACCAGACTGCACA       100               OR_ABa0269N20.f
Ch08-006W    TTGCATTGAATCAGTAGGTCAA      AGGACCTTGATTTGCCATGT        138               OB_Ba0054E21.f
Ch09-037G    CCGGAGTCCTATCCACAGGT        TTGGGCATCACCTGATAAGA        202               OG_ABa0001G03.f
Ch10-044G    CTTGTTTCCAGCAAGGTTGG        TACCAGGTTTGCTGCATGTT        432               OG_ABa0068J15.r

* The suffixes W and G indicate that the markers are multi-INDELs and single (granulata) INDELs, respectively.
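The summary figures quoted for Table 3 can be re-derived from the table itself. The short check below uses only the marker IDs and INDEL sizes listed in Table 3; the variable names are illustrative, and the chromosome number is assumed to be the two digits following "Ch" in each marker ID.

```python
from statistics import mean

# Marker ID -> INDEL size (bp), transcribed from Table 3.
INDEL_SIZES_BP = {
    "Ch01-301W": 56, "Ch02-269W": 135, "Ch02-308W": 64, "Ch02-342W": 196,
    "Ch02-343W": 245, "Ch03-128W": 76, "Ch03-173W": 217, "Ch03-363W": 98,
    "Ch04-276G": 229, "Ch04-312W": 218, "Ch05-067W": 60, "Ch05-070W": 108,
    "Ch05-109G": 276, "Ch05-202W": 83, "Ch05-277W": 64, "Ch06-269W": 79,
    "Ch06-300W": 180, "Ch06-306W": 61, "Ch07-233W": 100, "Ch08-006W": 138,
    "Ch09-037G": 202, "Ch10-044G": 432,
}

sizes = list(INDEL_SIZES_BP.values())
n_multi = sum(1 for marker in INDEL_SIZES_BP if marker.endswith("W"))
chromosomes = sorted({int(marker[2:4]) for marker in INDEL_SIZES_BP})

print(len(sizes), min(sizes), max(sizes), round(mean(sizes), 1))
print(n_multi, chromosomes)
```

The result agrees with the text: 22 markers, 18 of them multi-INDELs, sizes from 56 to 432 bp with a mean of 150.8 bp, and no marker on chromosomes 11 or 12.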

Validation of 22 INDEL markers in discrimination of wild Oryza species

Each of the 22 selected INDEL markers exhibited various interspecific amplicon polymorphisms in the genus Oryza (Fig. 1). To validate the reliability of these markers for discriminating Oryza species, they were applied to phylogenetic analysis of 102 wild accessions and two cultivars (Fig. 2). The band sizes of PCR products were manually determined following agarose gel electrophoresis. In addition, to detect minute differences in electrophoretic distances more precisely, we attempted fragment analysis. However, it was difficult to completely replace the manual method with fragment analysis, because the size of amplicons varied even among wild accessions from the same species. This may be because numerous inter- and intraspecific SNPs and small INDELs in the sequences flanking the targeted INDEL cause minute differences in electrophoretic distances. Thus, we decided to evaluate PCR band sizes mainly manually, using fragment analysis to support the manual method. All raw data obtained by either method are compiled in Supplemental Table 1. Furthermore, DNA sequencing confirmed that each of the 22 INDEL primer sets amplified PCR products from a syntenic locus shared by multiple genome types of Oryza species (an example of sequence alignments is shown in Supplemental Fig. 2), indicating the reliability of these markers. Only one of the 22 INDELs was derived from an exonic region (Ch04-276G), four were from intergenic regions (Ch04-312W, Ch05-109G, Ch05-202W and Ch06-300W) and all remaining INDELs were from intronic regions (data not shown).

Phylogenetic analysis 

Phylogenetic analysis using the 22 INDEL markers selected above successfully divided 102 wild accessions into 9 genome groups (Fig. 2). Here, the taxonomy of the genus Oryza followed Vaughan and Morishima (2003). Thirty-eight accessions of AA-genome species were divided into 3 phylogenetic groups: the group including rufipogon, barthii and glumaepatula, the meridionalis group and the longistaminata group. Thirty-seven accessions of BB, CC, BBCC and CCDD genome species, the "officinalis complex" (see below), were divided into five independent but closely related groups (Fig. 2). Fourteen accessions of O. latifolia, O. alta and O. grandiglumis, all classified as CCDD tetraploids, were divided into 4 closely related groups (Fig. 2). The accessions of latifolia and/or alta segregated into all 4 groups. All O. grandiglumis accessions were in a single group with two accessions of alta or latifolia (Fig. 2). The officinalis complex groups were first connected with EE-genome species and then with the AA-genome species in the phylogenetic tree (Fig. 2). In contrast, GG, FF and HHJJ taxa were distant from AA taxa (Fig. 2).



A minimum set of INDELs to discriminate 9 genome types 

This study revealed that seven INDEL markers (colored red in Fig. 1) were sufficient to discriminate 9 genome types and several species in the genus Oryza.

Four INDELs enabled classification of wild species into each genome group by PCR and agarose gel electrophoresis (Fig. 1 and Supplemental Fig. 1). First, Ch06-306W gave PCR bands characteristic of the CC, FF, GG and HHJJ genome groups. HHJJ species were separable into O. ridleyi and O. longiglumis, and O. brachyantha could consequently be determined, since the FF genome is composed of a single species. Second, Ch07-233W characterized diploid O. punctata (BB), diploid CC genome species and tetraploid BBCC species. This marker also discriminates O. meridionalis from other AA species. Third, Ch04-312W clearly discriminated the EE species, O. australiensis, and CCDD species from other species. O. longistaminata was discriminated from other AA species by this marker. Additional use of Ch05-070W enabled O. grandiglumis to be distinguished from other CCDD species.

Validation by fragment analysis further enabled several genome groups and species to be distinguished (Supplemental Table 1). Ch02-343W made O. longistaminata clearly separable from other AA species, and Ch06-300W and Ch10-044G distinguished tetraploid O. punctata from O. minuta (both BBCC).
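How a small discriminating subset might be searched for computationally can be illustrated with a greedy heuristic. This is a swapped-in technique for illustration, not the procedure used in this study (the text does not describe an algorithm), and a greedy search is not guaranteed to find the smallest possible set. The group names, marker names and band sizes below are invented.

```python
from itertools import combinations


def greedy_marker_set(patterns):
    """Pick markers until every pair of groups differs at some chosen marker.

    patterns: dict mapping group -> dict mapping marker -> band pattern.
    Returns the chosen markers in the order they were selected.
    """
    groups = sorted(patterns)
    markers = sorted(next(iter(patterns.values())))
    unresolved = set(combinations(groups, 2))  # pairs not yet separated
    chosen = []
    while unresolved:
        def gain(m):
            # Number of still-unresolved pairs that marker m would separate.
            return sum(1 for g1, g2 in unresolved if patterns[g1][m] != patterns[g2][m])

        best = max(markers, key=gain)
        if gain(best) == 0:
            raise ValueError("remaining groups cannot be separated")
        chosen.append(best)
        unresolved = {
            (g1, g2) for g1, g2 in unresolved if patterns[g1][best] == patterns[g2][best]
        }
    return chosen


# Hypothetical band sizes (bp) for four groups and three markers.
patterns = {
    "group_1": {"m_a": 100, "m_b": 200, "m_c": 300},
    "group_2": {"m_a": 100, "m_b": 250, "m_c": 300},
    "group_3": {"m_a": 150, "m_b": 200, "m_c": 300},
    "group_4": {"m_a": 150, "m_b": 250, "m_c": 350},
}
minimal = greedy_marker_set(patterns)
print(minimal)
```

In the toy input, two markers are enough to separate all four groups, and the third marker adds no further resolution.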

Discussion 

In this study, we developed 22 INDEL markers from the public information of over a million BESs by means of in silico selection and subsequent PCR-based assessments. They are co-dominant markers that reliably and rapidly discriminate all genome types and several species in the genus Oryza.

Phylogenetic analysis of 102 wild accessions also confirmed the reliability of the markers. In this study, O. rufipogon, O. barthii and O. glumaepatula could not be separated (Fig. 2). This result is comparable to those of previous studies, in which the AA-genome species lack clear morphological characteristics, except for O. longistaminata (reviewed in Vaughan and Morishima 2003). The grouping of O. meridionalis accessions in a clade of their own is consistent with the results obtained using restriction fragment length polymorphisms (RFLPs) and short interspersed elements (SINEs) (Doi et al. 1995, Wang et al. 1992, Xu et al. 2005). The longistaminata accessions displayed intraspecific amplicon polymorphisms for an INDEL marker (asterisked in Fig. 1), probably representing the previously reported genetic diversity of this species (Kiambi et al. 2005, Oka 1988).

BB, CC, BBCC and CCDD genome species are distributed widely in Asia, Australia, Central and South America and Africa and have been called the "officinalis complex" because of their diverse but relatively similar morphology (Tateoka 1962). RFLP analysis in a previous study revealed that these species are closely related (Wang et al. 1992), corresponding to the results of this study. Three CCDD tetraploids, O. latifolia, O. alta and O. grandiglumis, shared almost identical INDEL patterns with each other, resulting in all these accessions being in 4 closely related phylogenetic groups (Fig. 2). This result seems to be consistent with a previous report that these three species are closely related in terms of taxonomy; latifolia and alta were distinguished from each other only by the size of the spikelet, and grandiglumis was distinguished from latifolia and alta only by its large glumes (Morishima and Martin 1994). EE genome species were closer to the officinalis complex species than GG, FF and HHJJ species, consistent with a previous proposal that the EE genome is closely related to the DD genome progenitor (Ge et al. 1999, Wang et al. 1992). The phylogenetic results in this study largely corresponded to those in the previous studies, indicating the reliability of the INDEL markers in identification of wild species of rice.

Fig. 2. Phylogenetic tree of 102 wild Oryza accessions and two cultivars obtained using 22 INDEL markers. The tree was constructed by the UPGMA method (Sokal and Michener 1958), based on the data shown in Supplemental Table 1.

Phylogenetic analysis in this study also revealed the possibility of several mistakes in the registration of wild accessions conserved in the NIG collection. W1525 was classified as diploid O. eichingeri (CC) and W0018 and W0019 as tetraploid O. alta and O. latifolia (CCDD), respectively. However, the pattern of PCR amplification by three INDEL markers, Ch07-233W, Ch03-128W and Ch04-312W (Supplemental Fig. 1), suggests the possibility that these three accessions should be classified as tetraploid BBCC species. In addition, W1474, deposited as tetraploid O. punctata (BBCC), exhibited the INDEL pattern of diploid O. punctata (BB) (Supplemental Fig. 1). These problematic accessions fell into clades distinct from the expected clades in the phylogenetic tree (shaded in Fig. 2). In the case of W1525 and W1474, we increased the number of plants examined and confirmed by INDEL marker-assisted and flow-cytometric analyses that each of these accessions actually segregated into two different ploidies (Supplemental Fig. 3). These results clearly demonstrate the reliability of the INDEL markers developed in this study. In the current database, W1805 from Sri Lanka is deposited as O. rhizomatis, but was previously deposited as O. eichingeri (http://www.shigen.nig.ac.jp/rice/oryzabaseV4/strain/wildCore/detail/36). However, the haplotype of the 22 INDELs in this accession is identical to that in African O. eichingeri accessions (Fig. 2). These problematic accessions need to be reclassified more carefully.

Genetic resources are the common heritage of humankind. However, some natural populations of Oryza species have been lost from their original habitats, mainly due to drastic environmental changes and human activities (Akimoto et al. 1996, Nonomura et al. 2010). In situ and ex situ conservation strategies have become important globally. Wild Oryza species are quite diverse genetically and physiologically and sometimes form mixed populations of different species in their original habitats. Especially for ex situ conservation, it is critical to pay attention to maintaining the genetic reliability of strains by excluding any contamination from different species. The 22 PCR-based and co-dominant INDEL markers developed in this study will be powerful tools to help determine species identity and genome types easily and to establish germplasm stocks with corroborative genetic information useful for experimental studies and rice breeding.